Abstract

We study sparse approximation by greedy algorithms. Our contribution is
two-fold. First, we prove exact recovery with high probability of random
$K$-sparse signals within $\lceil K(1+\epsilon)\rceil$ iterations of the
Orthogonal Matching Pursuit (OMP). This result shows that in a
probabilistic sense the OMP is almost optimal for exact recovery. Second,
we prove the Lebesgue-type inequalities for the Weak Chebyshev Greedy
Algorithm, a generalization of the Weak Orthogonal Matching Pursuit to the
case of a Banach space. The main novelty of these results is a Banach
space setting instead of a Hilbert space setting. However, even in the
case of a Hilbert space our results add some new elements to known results
on the Lebesgue-type inequalities for the RIP dictionaries. Our technique
is a development of the recent technique created by Zhang.

1 Introduction



This paper deals with sparse approximation. Driven by applications in
biology, medicine, and engineering, approximation problems are formulated in
very high dimensions, which bring to the fore new phenomena. One aspect
of the high-dimensional context is a focus on sparse signals (functions). The
main motivation for the study of sparse signals is that many real world
signals can be well approximated by sparse ones. A very important step in
solving multivariate problems with large dimension occurred during the last 20
years. Researchers began to use sparse representations as a way to model
the corresponding function classes. This approach automatically implies a
need for nonlinear approximation, in particular, for greedy approximation.
We give a brief description of a sparse approximation problem. In a general
setting we are working in a Banach space $X$ with a redundant system of
elements $\mathcal D$ (dictionary $\mathcal D$). There is a solid justification of the
importance of a Banach space setting in numerical analysis in general and in
sparse approximation in particular (see, for instance, [?], Preface, and [?]).
An element (function, signal) $f\in X$ is said to be $K$-sparse with respect to
$\mathcal D$ if it has a representation $f=\sum_{i=1}^K x_ig_i$, $g_i\in\mathcal D$,
$i=1,\dots,K$. The set of all $K$-sparse elements is denoted by
$\Sigma_K(\mathcal D)$. For a given element $f_0$ we introduce the error
of best $m$-term approximation

$$\sigma_m(f_0,\mathcal D) := \inf_{f\in\Sigma_m(\mathcal D)}\|f_0-f\|.$$

Here are two fundamental problems of sparse approximation. 

P1. Exact recovery. Suppose we know that $f_0\in\Sigma_K(\mathcal D)$. How can we
recover it?

P2. Approximate recovery. How to design a practical algorithm that 
builds m-term approximations comparable to best m-term approximations? 

It is known that in both of the above problems greedy-type algorithms 
play a fundamental role. We discuss one of them here. There are two special 
cases of the above general setting of the sparse approximation problem. 

(I). Instead of a Banach space $X$ we consider a Hilbert space $H$.
Approximation is still with respect to a redundant dictionary $\mathcal D$.

(II). We approximate in a Banach space $X$ with respect to a basis $\Psi$
instead of a redundant dictionary $\mathcal D$.

This section discusses setting (I) and the corresponding generalizations
to the Banach space setting. Section 4 addresses setting (II). We begin our
discussion with the Orthogonal Greedy Algorithm (OGA) in a Hilbert space.
The Orthogonal Greedy Algorithm is called the Orthogonal Matching Pursuit
(OMP) in signal processing. We will use the name Orthogonal Matching
Pursuit for this algorithm in this paper. It is natural to compare the
performance of the OMP with the best $m$-term approximation with regard to a
dictionary $\mathcal D$. We recall some notations and definitions from the theory of
greedy algorithms. Let $H$ be a real Hilbert space with an inner product
$\langle\cdot,\cdot\rangle$ and the norm $\|x\| := \langle x,x\rangle^{1/2}$. We say a
set $\mathcal D$ of functions (elements) from $H$ is a dictionary if each
$g\in\mathcal D$ has a unit norm ($\|g\|=1$) and the closure of
$\operatorname{span}\mathcal D$ is $H$. Let a sequence $\tau=\{t_k\}_{k=1}^\infty$,
$0\le t_k\le1$, be given. The following greedy algorithm was defined in [?]
under the name Weak Orthogonal Greedy Algorithm (WOGA).

Weak Orthogonal Matching Pursuit (WOMP). Let $f_0$ be given.
Then for each $m\ge1$ we inductively define:

(1) $\varphi_m\in\mathcal D$ is any element satisfying

$$|\langle f_{m-1},\varphi_m\rangle|\ge t_m\sup_{g\in\mathcal D}|\langle f_{m-1},g\rangle|.$$

(2) Let $H_m := \operatorname{span}(\varphi_1,\dots,\varphi_m)$ and let $P_{H_m}(\cdot)$
denote an operator of orthogonal projection onto $H_m$. Define

$$G_m(f_0,\mathcal D) := P_{H_m}(f_0).$$

(3) Define the residual after the $m$th iteration of the algorithm

$$f_m := f_0-G_m(f_0,\mathcal D).$$

In the case $t_k=1$, $k=1,2,\dots$, WOMP is called the Orthogonal Matching
Pursuit (OMP). In this paper we only consider the case $t_k=t$, $k=1,2,\dots$,
$t\in(0,1]$.
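
The three steps of the WOMP can be sketched in a few lines for a finite dictionary in $\mathbb R^n$. This is only an illustrative sketch under stated assumptions: the function name `womp`, the tolerance `tol`, and the rule "take the first index whose score reaches $t$ times the maximum" are choices made here (the definition allows any such element), and the projection is maintained by Gram-Schmidt rather than by any method prescribed in the text.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def womp(f0, dictionary, max_iter, t=1.0, tol=1e-12):
    """Weak Orthogonal Matching Pursuit sketch; t = 1 gives the OMP.

    dictionary: list of unit-norm vectors (lists of floats).
    Returns (indices picked, residual f_m).
    """
    residual = list(f0)
    basis = []    # orthonormal basis of H_m = span(phi_1, ..., phi_m)
    picked = []
    for _ in range(max_iter):
        scores = [abs(dot(residual, g)) for g in dictionary]
        best = max(scores)
        if best <= tol:          # residual is orthogonal to the dictionary
            break
        # Step (1): any element with score >= t * sup; here the first one (assumption).
        idx = next(i for i, s in enumerate(scores) if s >= t * best)
        picked.append(idx)
        # Step (2): extend the orthonormal basis by Gram-Schmidt.
        q = list(dictionary[idx])
        for b in basis:
            c = dot(q, b)
            q = [qi - c * bi for qi, bi in zip(q, b)]
        norm = math.sqrt(dot(q, q))
        if norm > tol:
            q = [qi / norm for qi in q]
            basis.append(q)
            # Step (3): new residual f_m = f_0 - P_{H_m}(f_0).
            c = dot(residual, q)
            residual = [ri - c * qi for ri, qi in zip(residual, q)]
    return picked, residual

s = 1.0 / math.sqrt(2.0)
D = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [s, s, 0.0]]
f0 = [2.0, 0.0, -1.0]            # 2-sparse with respect to D
picked, residual = womp(f0, D, max_iter=3)
```

In this toy example the signal is 2-sparse and the OMP stops with a zero residual after two iterations.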

The theory of the WOMP is well developed (see [?]). In the first results on
the performance of the WOMP in problems P1 and P2 researchers imposed the
incoherence assumption on a dictionary $\mathcal D$. The reader can find a detailed
discussion of these results in [?], Section 2.6 and [?]. Recently, exact
recovery results and Lebesgue-type inequalities for the WOMP under the assumption
that $\mathcal D$ satisfies the Restricted Isometry Property (RIP) introduced in compressed
sensing theory (see Definition 2.1 below) have been proved (see [?], [?], [?]).
A breakthrough result in this direction was obtained by Zhang [?]. In
particular, he proved that if $\delta_{31K}(\mathcal D)<1/3$ then the OMP recovers exactly all
$K$-sparse signals within $30K$ iterations. In other words, $f_{30K}=0$. It is an
interesting and difficult problem to improve the constant 30. There are several
papers devoted to this problem (see [?] and [?]). In this paper we develop
Zhang's technique in two directions: (1) to obtain exact recovery with high
probability of random $K$-sparse signals within $\lceil K(1+\epsilon)\rceil$ iterations of the
OMP and (2) to obtain recovery results and the Lebesgue-type inequalities
in the Banach space setting.

In Section 2 we prove exact recovery results under RIP conditions on a 
dictionary combined with assumptions on the sparse signal to be recovered 
(see Theorem 2.1). We prove that the corresponding assumptions on a sparse 
signal are satisfied with high probability if it is a random signal. In particular, 
we prove the following theorem. 

Theorem 1.1. For any $\epsilon>0$ there exist $\delta=\delta(\epsilon)>0$ and $K_0=K_0(\epsilon)$ such
that for any dictionary $\mathcal D$, $\delta_{2K}(\mathcal D)<\delta$, $K\ge K_0$, the following statement
holds. Let $f_0\in\Sigma_K(\mathcal D)$ and let its nonzero coefficients be independent random
variables uniformly distributed on $[-1,1]$. Then $f_{\lceil K(1+\epsilon)\rceil}=0$ with probability
greater than $1-\exp(-C(\epsilon)K)$.

This theorem shows that in a probabilistic sense the OMP is almost op- 
timal for exact recovery. 

Section 3 is devoted to the Banach space setting. Let $X$ be a Banach
space with norm $\|\cdot\| := \|\cdot\|_X$. As in the case of Hilbert spaces we say that a
set of elements (functions) $\mathcal D$ from $X$ is a dictionary if each $g\in\mathcal D$ has norm
one ($\|g\|=1$), and the closure of $\operatorname{span}\mathcal D$ is $X$. For a nonzero element $g\in X$
we let $F_g$ denote a norming (peak) functional for $g$:

$$\|F_g\|_{X^*}=1,\qquad F_g(g)=\|g\|_X.$$

The existence of such a functional is guaranteed by the Hahn-Banach theorem.
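
For a concrete picture of a norming functional, consider the finite-dimensional space $\ell_p^n$ with $1<p<\infty$ (an example chosen here, not taken from the text). There the norming functional of a nonzero $g$ is unique and is given by the standard formula $F_g(y)=\|g\|_p^{1-p}\sum_i \operatorname{sign}(g_i)|g_i|^{p-1}y_i$. The sketch below, with hypothetical helper names, evaluates it and checks the two defining properties numerically.

```python
def lp_norm(v, p):
    return sum(abs(a) ** p for a in v) ** (1.0 / p)

def norming_functional(g, p):
    """Coefficients c with F_g(y) = sum_i c_i * y_i, for nonzero g in l_p^n, 1 < p < inf."""
    scale = lp_norm(g, p) ** (p - 1)
    sign = lambda a: (a > 0) - (a < 0)
    return [sign(a) * abs(a) ** (p - 1) / scale for a in g]

def apply_functional(c, y):
    return sum(ci * yi for ci, yi in zip(c, y))

g = [3.0, -4.0, 1.0]
p = 3.0
q = p / (p - 1.0)                      # dual exponent: X* is l_q^n
c = norming_functional(g, p)
value_at_g = apply_functional(c, g)    # should equal ||g||_p
dual_norm = lp_norm(c, q)              # should equal 1
```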

Let $\tau := \{t_k\}_{k=1}^\infty$ be a given weakness sequence of nonnegative
numbers $t_k\le1$, $k=1,2,\dots$. We define the Weak Chebyshev Greedy Algorithm
(WCGA) (see [?]) as a generalization for Banach spaces of the Weak Orthogonal
Matching Pursuit. We study in detail the WCGA in this paper.

Weak Chebyshev Greedy Algorithm (WCGA). Let $f_0$ be given.
Then for each $m\ge1$ we have the following inductive definition.

(1) $\varphi_m := \varphi_m^{c,\tau}\in\mathcal D$ is any element satisfying

$$|F_{f_{m-1}}(\varphi_m)|\ge t_m\sup_{g\in\mathcal D}|F_{f_{m-1}}(g)|.$$

(2) Define

$$\Phi_m := \Phi_m^\tau := \operatorname{span}\{\varphi_j\}_{j=1}^m,$$

and define $G_m := G_m^{c,\tau}$ to be the best approximant to $f_0$ from $\Phi_m$.

(3) Let

$$f_m := f_m^{c,\tau} := f_0-G_m.$$

In Section 3 we prove the Lebesgue-type inequalities for the WCGA.
A very important advantage of the WCGA is its convergence and rate of
convergence properties. The WCGA is well defined for all $m$. Moreover, it
is known (see [?] and [?]) that the WCGA with $\tau=\{t\}$ converges for all $f_0$
in all uniformly smooth Banach spaces with respect to any dictionary. Here,
for a real Banach space $X$, the modulus of smoothness of $X$ is
defined as follows

$$\rho(u) := \sup_{x,y;\,\|x\|=\|y\|=1}\left(\frac{\|x+uy\|+\|x-uy\|}{2}-1\right),$$

and a uniformly smooth Banach space is one with $\rho(u)/u\to0$ when $u\to0$.
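
As a concrete instance of the modulus of smoothness (a standard computation, added here for orientation and not taken from the text), a Hilbert space of dimension at least two has $\rho(u)=(1+u^2)^{1/2}-1$. In particular it satisfies the power-type bound $\rho(u)\le\gamma u^2$ with $\gamma=1/2$, which is the kind of assumption used in the theorems below.

```latex
% For \|x\| = \|y\| = 1 in a Hilbert space, concavity of the square root
% and the parallelogram identity give
\frac{\|x+uy\| + \|x-uy\|}{2}
  \le \left(\frac{\|x+uy\|^2 + \|x-uy\|^2}{2}\right)^{1/2}
  = (1+u^2)^{1/2},
% with equality when x is orthogonal to y. Hence
\rho(u) = (1+u^2)^{1/2} - 1 \le \frac{u^2}{2}.
```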

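For notational convenience we consider here a countable dictionary
$\mathcal D=\{g_i\}_{i=1}^\infty$. For a given $f_0$, let the sparse element (signal)

$$f := f^\epsilon := \sum_{i\in T}x_ig_i$$

be such that $\|f_0-f^\epsilon\|\le\epsilon$ and $|T|=K$. For $A\subset T$ denote

$$f_A := f_A^\epsilon := \sum_{i\in A}x_ig_i.$$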

We use the following two assumptions. 

A1. Nikol'skii-type inequality. The sparse element $f=\sum_{i\in T}x_ig_i$
satisfies the Nikol'skii-type $\ell_1X$ inequality with parameter $r$ if

$$\sum_{i\in A}|x_i|\le C_1|A|^r\|f_A\|,\quad A\subset T,\quad r\ge1/2.$$

A2. Incoherence property. The sparse element $f=\sum_{i\in T}x_ig_i$ has the
incoherence property with parameters $D$ and $U$ if for any $A\subset T$ and any $\Lambda$
such that $A\cap\Lambda=\emptyset$ and $|A|+|\Lambda|\le D$, we have for any $\{c_i\}$

$$\|f_A-\sum_{i\in\Lambda}c_ig_i\|\ge U^{-1}\|f_A\|.$$

The main result of Section 3 is the following. 






Theorem 1.2. Let $X$ be a Banach space with $\rho(u)\le\gamma u^2$. Suppose $K$-sparse
$f^\epsilon$ satisfies A1, A2 and $\|f_0-f^\epsilon\|\le\epsilon$. Then the WCGA with weakness
parameter $t$ applied to $f_0$ provides

$$\|f_{C(t,\gamma,C_1)U^2\ln(U+1)K^{2r}}\|\le C\epsilon\quad\text{for}\quad K+C(t,\gamma,C_1)U^2\ln(U+1)K^{2r}\le D$$

with an absolute constant $C$.

Theorem 1.2 provides a corollary for Hilbert spaces that gives sufficient
conditions somewhat weaker than the known RIP conditions on $\mathcal D$ for the
Lebesgue-type inequality to hold. We formulate it as a theorem.

Theorem 1.3. Let $X$ be a Hilbert space. Suppose $K$-sparse $f^\epsilon$ satisfies A2
and $\|f_0-f^\epsilon\|\le\epsilon$. Then the WOMP with weakness parameter $t$ applied to
$f_0$ provides

$$\|f_{C(t,U)K}\|\le C\epsilon\quad\text{for}\quad K+C(t,U)K\le D$$

with an absolute constant $C$.

Theorem 1.3 implies the following corollary.

Corollary 1.1. Let $X$ be a Hilbert space. Suppose any $K$-sparse $f$ satisfies
A2. Then the WOMP with weakness parameter $t$ applied to $f_0$ provides

$$\|f_{C(t,U)K}\|\le C\sigma_K(f_0,\mathcal D)\quad\text{for}\quad K+C(t,U)K\le D$$

with an absolute constant $C$.

We show in Section 3 that the RIP condition with parameters $D$ and $\delta$
implies the $(D,D)$-unconditionality with $U=(1+\delta)^{1/2}(1-\delta)^{-1/2}$. Therefore,
Corollary 1.1 reads as follows in this case.

Corollary 1.2. Let $X$ be a Hilbert space. Suppose $\mathcal D$ satisfies the RIP
condition with parameters $D$ and $\delta$. Then the WOMP with weakness parameter $t$
applied to $f_0$ provides

$$\|f_{C(t,\delta)K}\|\le C\sigma_K(f_0,\mathcal D)\quad\text{for}\quad K+C(t,\delta)K\le D$$

with an absolute constant $C$.

We emphasize that in Theorem 1.2 we impose our conditions on an
individual function $f^\epsilon$. It may happen that the dictionary does not satisfy
the assumptions of the $\ell_1X$ inequality and $(K,D)$-unconditionality (see Section 3)
but the given $f_0$ can be approximated by $f^\epsilon$ which does satisfy assumptions
A1 and A2. Even in the case of a Hilbert space our approach adds something
new to the study based on the RIP. First of all, Theorem 1.3 shows
that it is sufficient to impose assumption A2 on an individual $f^\epsilon$ in order
to obtain exact recovery and the Lebesgue-type inequality results. Second,
Corollary 1.1 shows that the condition A2, which is weaker than the RIP
condition, is sufficient for exact recovery and the Lebesgue-type inequality
results. Third, Corollary 1.2 shows that even if we impose our assumptions
in terms of RIP we do not need to assume that $\delta<\delta_0$. In fact, the result
works for all $\delta<1$ with parameters depending on $\delta$.

2 Almost optimality of the OMP 

We prove Theorem 1.1 in this section. For the reader's convenience we use
notations which are standard in signal processing. Let $\mathcal D=\{\varphi_i\}_{i=1}^N$ be a
dictionary in $\mathbb R^M$, $M<N$. By $\Phi$ denote an $M\times N$ matrix, consisting of
elements of $\mathcal D$ ($\varphi_i\in\mathbb R^M$ is the $i$-th column of $\Phi$). We say that $x\in\mathbb R^N$ is
$S$-sparse if $x$ has at most $S$ nonzero coordinates.

Definition 2.1. A matrix $\Phi$ satisfies RIP($S,\delta$) if the inequality

$$(1-\delta)\|x\|^2\le\|\Phi x\|^2\le(1+\delta)\|x\|^2$$ (2.1)

holds for all $S$-sparse $x\in\mathbb R^N$. The minimum of all constants $\delta$, satisfying
(2.1), is called the isometric constant $\delta_S(\Phi)=\delta_S(\mathcal D)$.
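
For $S=2$ and unit-norm columns the isometric constant can be computed directly: on a pair of columns with inner product $c$ we have $\|\Phi x\|^2=x_i^2+x_j^2+2cx_ix_j$, which ranges over $[(1-|c|)\|x\|^2,(1+|c|)\|x\|^2]$, so $\delta_2(\Phi)$ equals the largest $|\langle\varphi_i,\varphi_j\rangle|$, $i\ne j$. The sketch below uses this elementary fact; the function name `delta_2` and the toy dictionary are illustrative assumptions, and nothing here extends to larger $S$, where eigenvalues of all $S\times S$ Gram submatrices would be needed.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def delta_2(columns):
    """Isometric constant of order 2 for a list of unit-norm columns.

    For unit-norm columns this is the largest absolute inner product
    between two distinct columns.
    """
    best = 0.0
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            best = max(best, abs(dot(columns[i], columns[j])))
    return best

s = 1.0 / math.sqrt(2.0)
columns = [[1.0, 0.0], [0.0, 1.0], [s, s]]   # toy dictionary in R^2
d2 = delta_2(columns)

# Extremal 2-sparse vector x = (1, 0, 1): ||Phi x||^2 = (1 + d2) * ||x||^2.
phi_x = [1.0 + s, s]
ratio = dot(phi_x, phi_x) / 2.0
```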

In this section we study the OMP and use the "compressed sensing
notation" for the residual of the OMP. Set

$$r^m := f_m,\quad m\ge0.$$

Consider the set

$$\Omega=\{1,\dots,N\}.$$

Since $f_0\in\Sigma_K(\mathcal D)$, there exists an $x=(x_1,x_2,\dots,x_N)$, $\operatorname{supp}x=T$, $T\subset\Omega$,
$|T|=K$ such that

$$r^0=f_0=\Phi x.$$

Denote by $T^m$ the set of indices of $\varphi_i$ picked by the OMP after $m$ iterations.
According to the definition of the OMP for every $m\ge0$ we choose $x^m\in\mathbb R^N$,
satisfying the following relations

$$\operatorname{supp}x^m\subset T^m,\quad |T^m|=m\ \text{ while } r^m\ne0,$$ (2.2)

$$G_m(f,\mathcal D)=\Phi x^m,\qquad r^m=\Phi x-\Phi x^m.$$ (2.3)

Let $N(x,\nu)$ be the minimal integer such that

$$\|x_\Lambda\|^2>\nu\quad\text{for all }\Lambda\subset T,\ |\Lambda|\ge N(x,\nu)+1.$$ (2.4)
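
The quantity $N(x,\nu)$ in (2.4) is easy to compute: the worst set $\Lambda$ of a given size consists of the smallest $|x_i|$, so $N(x,\nu)$ is the largest number of smallest squared coefficients whose sum does not exceed $\nu$. The following sketch implements this observation; the function name `minimal_N` is an assumption made for illustration, and $\nu\ge0$ is assumed.

```python
def minimal_N(x, nu):
    """N(x, nu) from (2.4): the smallest N such that every index set Lambda
    inside supp(x) with |Lambda| >= N + 1 satisfies ||x_Lambda||^2 > nu."""
    squares = sorted(v * v for v in x if v != 0.0)
    total = 0.0
    count = 0
    for s in squares:
        if total + s > nu:   # adding one more small coefficient exceeds nu
            break
        total += s
        count += 1
    return count

flat = minimal_N([1.0, -1.0, 1.0, 1.0, 1.0, 0.0, 0.0], 2.5)   # all |x_i| = 1
mixed = minimal_N([0.1, 0.5, 1.0], 0.3)
```

For coefficients of modulus one, $N(x,\nu)$ is the integer part of $\nu$, which is the situation of Corollary 2.1.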

Theorem 2.1. There exists an absolute constant $C$ such that for any $\delta$,
$0<\delta<0.001$, an integer $K\ge K_0=K_0(\delta)$, and a dictionary $\mathcal D$, $\delta_{2K}(\mathcal D)<\delta$,
the following statement holds. The OMP recovers exactly every $K$-sparse
signal $x$, $\|x\|_\infty\le1$, within $K+6N(x,C\delta^{1/2}K)$ iterations, in other words,

$$r^{K+6N(x,C\delta^{1/2}K)}=0.$$

Here is a direct corollary of Theorem 2.1.

Corollary 2.1. Let $K$-sparse $x$ be such that $|x_i|=1$, $i\in T$, $|T|=K$.
Then under assumptions of Theorem 2.1 the OMP recovers $x$ exactly within
$(1+6C\delta^{1/2})K$ iterations.

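Proof of Theorem 2.1. Set

$$\Gamma^m := T\setminus T^m.$$

We fix

$$\alpha := \delta^{1/2}.$$ (2.5)

Consider an integer $m\ge0$ such that

$$|T^m|+[\alpha K]=m+[\alpha K]\le K.$$ (2.6)

Assume that $K\ge K_0=K_0(\alpha)>1/\alpha$. Let $z^m$ be the maximal number
satisfying the following inequalities

$$|\{i\in\Gamma^m:|x_i|\ge z^m\}|\ge[\alpha K],\qquad |\{i\in\Gamma^m:|x_i|\le z^m\}|\ge|\Gamma^m|-[\alpha K].$$ (2.7)

In other words $z^m$ is the $[\alpha K]$th largest element out of $\{|x_i|\}_{i\in\Gamma^m}$. We use
the following lemma.

Lemma 2.1. Under (2.6) the following inequality is valid:

$$\|r^m\|^2-\|r^{m+1}\|^2\ge(z^m)^2(1-C_1\alpha).$$

Proof. According to (2.7), we can choose sets

$$\Gamma_+^m\subset\Gamma^m,\quad |\Gamma_+^m|=[\alpha K]$$ (2.8)

and

$$\Gamma_-^m := \Gamma^m\setminus\Gamma_+^m$$ (2.9)

with the following property

$$\min_{i\in\Gamma_+^m}|x_i|\ge z^m\ge\max_{i\in\Gamma_-^m}|x_i|.$$ (2.10)

Consider $w\in\mathbb R^N$ such that

$$w_{(T\cap T^m)\cup\Gamma_+^m}=x_{(T\cap T^m)\cup\Gamma_+^m},\qquad w_{\Omega\setminus((T\cap T^m)\cup\Gamma_+^m)}=0.$$ (2.11)

We use several well-known properties of the OMP:

$$\|r^m\|^2-\|r^{m+1}\|^2\ge\sup_{\phi\in\mathcal D}\langle r^m,\phi\rangle^2,$$ (2.12)

$$\sup_{\phi\in\mathcal D}|\langle r^m,\phi\rangle|\ge\frac{|\langle r^m,\Phi u\rangle|}{\|u\|_1},\quad u\in\mathbb R^N,$$ (2.13)

$$\langle r^m,\Phi u\rangle=0\quad\text{if }\operatorname{supp}u\subset T^m.$$ (2.14)

In particular

$$\langle r^m,\Phi x^m\rangle=0.$$ (2.15)

Using (2.13) for $u=w_{\Omega\setminus T^m}$, we can estimate

$$\begin{aligned}
|\sup_{\phi\in\mathcal D}\langle r^m,\phi\rangle|
&\ge\frac{|\langle r^m,\Phi w_{\Omega\setminus T^m}\rangle|}{\|w_{\Omega\setminus T^m}\|_1}
\overset{(2.14)}{=}\frac{|\langle r^m,\Phi w\rangle|}{\|w_{\Omega\setminus T^m}\|_1}\\
&\overset{(2.15)}{=}\frac{|\langle r^m,\Phi(w-x^m)\rangle|}{\|w_{\Omega\setminus T^m}\|_1}
\ge\frac{|\langle r^m,\Phi(w-x^m)\rangle|}{\|w_{\Omega\setminus T^m}\|_0^{1/2}\|w_{\Omega\setminus T^m}\|_2}.
\end{aligned}$$

Applying (2.8) and (2.11), we obtain from the above inequality

$$|\sup_{\phi\in\mathcal D}\langle r^m,\phi\rangle|\ge\frac{|\langle r^m,\Phi(w-x^m)\rangle|}{|\Gamma_+^m|^{1/2}\|w_{\Omega\setminus T^m}\|_2}\ge\frac{|\langle r^m,\Phi(w-x^m)\rangle|}{(\alpha K)^{1/2}\|w_{\Omega\setminus T^m}\|_2}.$$ (2.16)

We estimate

$$\begin{aligned}
\|r^m\|^2&=\|\Phi(x-x^m)\|^2\overset{RIP}{\ge}(1-\delta)\|x-x^m\|^2\ge(1-\delta)\|(x-x^m)_{\Omega\setminus T^m}\|^2\\
&=(1-\delta)\|x_{\Gamma^m}\|^2=(1-\delta)\|x_{\Gamma_+^m\cup\Gamma_-^m}\|^2\\
&=(1-\delta)(\|x_{\Gamma_+^m}\|^2+\|x_{\Gamma_-^m}\|^2)\overset{(2.10),(2.8)}{\ge}(1-\delta)((z^m)^2[\alpha K]+\|x_{\Gamma_-^m}\|^2),
\end{aligned}$$

and

$$\|\Phi(w-x)\|^2\overset{(2.11)}{=}\|\Phi x_{\Gamma_-^m}\|^2\overset{RIP}{\le}(1+\delta)\|x_{\Gamma_-^m}\|^2.$$

Combining the two last inequalities, we obtain, for sufficiently large
$K_0=K_0(\alpha)=K_0(\delta)$,

$$\begin{aligned}
\|r^m\|^2-\|\Phi(w-x)\|^2&\ge(1-\delta)(z^m)^2[\alpha K]-2\delta\|x_{\Gamma_-^m}\|^2\\
&\overset{(2.10)}{\ge}(1-\delta)(z^m)^2[\alpha K]-2\delta(z^m)^2|\Gamma_-^m|\\
&\overset{(2.9)}{\ge}(1-\delta)(z^m)^2[\alpha K]-2\delta(z^m)^2K\\
&\ge(1-2\delta)(z^m)^2(\alpha K)-2\delta(z^m)^2K\\
&=(z^m)^2\alpha K\left((1-2\delta)-\frac{2\delta}{\alpha}\right)\\
&\overset{(2.5)}{\ge}(z^m)^2\alpha K(1-4\alpha).
\end{aligned}$$ (2.17)

Following the technique from [?] we have

$$\begin{aligned}
|\langle r^m,\Phi(w-x^m)\rangle|&=\tfrac12\left|\|\Phi(w-x^m)\|^2+\|r^m\|^2-\|\Phi(w-x^m)-r^m\|^2\right|\\
&=\tfrac12\left|\|\Phi(w-x^m)\|^2+\|r^m\|^2-\|\Phi(w-x)\|^2\right|\\
&\overset{(2.17)}{\ge}\|\Phi(w-x^m)\|\left((z^m)^2\alpha K(1-4\alpha)\right)^{1/2}\\
&=\|\Phi(w-x^m)\|_2\,z^m(\alpha K(1-4\alpha))^{1/2}\\
&\overset{RIP}{\ge}(1-\delta)^{1/2}\|w-x^m\|\,z^m(\alpha K(1-4\alpha))^{1/2}\\
&\ge\|w-x^m\|\,z^m(\alpha K)^{1/2}(1-C_1\alpha)\\
&\ge\|(w-x^m)_{\Omega\setminus T^m}\|\,z^m(\alpha K)^{1/2}(1-C_1\alpha)\\
&=\|w_{\Omega\setminus T^m}\|\,z^m(\alpha K)^{1/2}(1-C_1\alpha).
\end{aligned}$$ (2.18)

Substituting (2.18) in (2.16) and (2.12), we finally get

$$\|r^m\|^2-\|r^{m+1}\|^2\ge\frac{|\langle r^m,\Phi(w-x^m)\rangle|^2}{\alpha K\|w_{\Omega\setminus T^m}\|_2^2}\ge\frac{\|w_{\Omega\setminus T^m}\|_2^2(z^m)^2\alpha K(1-C_1\alpha)^2}{\alpha K\|w_{\Omega\setminus T^m}\|_2^2}=(z^m)^2(1-C_1\alpha)^2,$$

which gives the claimed inequality after adjusting the absolute constant $C_1$.

□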



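We continue to prove Theorem 2.1. Without loss of generality we may
assume that $T=\{1,\dots,K\}$ and that the sequence $\{|x_i|\}_{i=1}^K$ decreases. Then
using the inequality $|T\cap T^m|\le m$ and the definition (2.7), we have

$$z^m\ge|x_{m+[\alpha K]}|\ge|x_{m+1+[\alpha K]}|.$$

Applying Lemma 2.1, we have for $m\ge1$, $m+[\alpha K]\le K$,

$$\|r^{m-1}\|^2-\|r^m\|^2\ge(z^{m-1})^2(1-C_1\alpha)\ge x_{m+[\alpha K]}^2(1-C_1\alpha).$$ (2.19)

First we bound $\|r^K\|$ from above

$$\begin{aligned}
\|r^K\|^2&=\|r^0\|^2-\sum_{m=1}^K(\|r^{m-1}\|^2-\|r^m\|^2)\\
&\overset{(r^0=\Phi x)}{\le}\|\Phi x\|^2-\sum_{m=1}^{K-[\alpha K]}(\|r^{m-1}\|^2-\|r^m\|^2)\\
&\overset{RIP,\,(2.19)}{\le}(1+\delta)\sum_{i=1}^Kx_i^2-\sum_{m=1}^{K-[\alpha K]}x_{m+[\alpha K]}^2(1-C_1\alpha)\\
&=(1+\delta)\sum_{i=1}^Kx_i^2-\sum_{i=1+[\alpha K]}^Kx_i^2(1-C_1\alpha)\\
&=(\delta+C_1\alpha)\sum_{i=1}^Kx_i^2+(1-C_1\alpha)\sum_{i=1}^{[\alpha K]}x_i^2\\
&\le(\delta+C_1\alpha)K+[\alpha K]\le KC_2\alpha\overset{(2.5)}{=}KC_2\delta^{1/2}.
\end{aligned}$$ (2.20)

Then using RIP, we can estimate $\|r^K\|$ from below

$$\|r^K\|^2\ge(1-\delta)\sum_{i\in T\setminus T^K}x_i^2.$$ (2.21)

Set

$$C := \frac{C_2}{1-\delta}.$$

Combining this definition with (2.20) and (2.21), we obtain

$$\sum_{i\in T\setminus T^K}x_i^2\le CK\delta^{1/2}.$$

Thus, using (2.4), we conclude that

$$|T\setminus T^K|\le N(x,CK\delta^{1/2}).$$ (2.22)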
It is known that (see Lemma 1.2 from [?] and Lemma 1 from [?]) 

Then the condition $\delta<0.001$ implies that

$$\delta_{10K}(\Phi)\le\delta_{16K}(\Phi)\le27\delta_{2K}(\Phi)\le27\delta<0.03.$$ (2.23)

Now we can apply the improvement of Zhang's theorem obtained by Wang
and Shim ([?], Theorem 3.1). It claims that under (2.23) we have

$$r^{K+6|T\setminus T^K|}=0.$$

Therefore, taking into account (2.22), we finally get

$$r^{K+6N(x,C\delta^{1/2}K)}=0.$$

□

As corollaries of Theorem 2.1 we obtain Theorem 1.1 and the following 
result. 

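Theorem 2.2. For any $\epsilon_1,\epsilon_2>0$ there exist $\delta=\delta(\epsilon_1,\epsilon_2)>0$ and
$K_0=K_0(\epsilon_1,\epsilon_2)$ such that for any dictionary $\mathcal D$, $\delta_{2K}(\mathcal D)<\delta$, $K\ge K_0$, the
following statement holds. If $\Phi x=f_0\in\Sigma_K(\mathcal D)$ and its nonzero coefficients belong
to $[-1,1]\setminus(-\epsilon_1,\epsilon_1)$, then $r^{\lceil K(1+\epsilon_2)\rceil}=0$.

Proof. It is clear that for any $\Lambda\subset T$ we have

$$\|x_\Lambda\|^2\ge\epsilon_1^2|\Lambda|.$$

Hence according to (2.4) we get

$$N(x,\nu)\le\frac{\nu}{\epsilon_1^2}.$$

Then

$$N(x,CK\delta^{1/2})\le\frac{CK\delta^{1/2}}{\epsilon_1^2}.$$

Thus, to complete the proof it remains to choose $\delta=\delta(\epsilon_1,\epsilon_2)$ such that

$$\frac{6C\delta^{1/2}}{\epsilon_1^2}<\epsilon_2.$$

□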

Lemma 2.2. Assume that $p<1$ and numbers $x_i$, $1\le i\le K$, $K\ge K_0(p)$,
are independent random variables uniformly distributed on $[-1,1]$. Then

$$|\{i:|x_i|\le p\}|\le2pK$$

with probability greater than $1-\exp(-C(p)K)$.

Proof. For $i$, $1\le i\le K$, we set $\xi_i=0$ if $|x_i|>p$, and $\xi_i=1$ otherwise. So
$\xi_i$ has Bernoulli distribution with

$$P\{\xi_i=1\}=p,\quad P\{\xi_i=0\}=1-p,\quad E\xi_i=p.$$

By Hoeffding's inequality (see, for instance, [?], p. 197) we obtain

$$P\left\{\left|\frac1K\sum_{i=1}^K\xi_i-p\right|\ge p\right\}\le2\exp(-Kp^2/2).$$

Clearly,

$$|\{i:|x_i|\le p\}|=\sum_{i=1}^K\xi_i.$$

Therefore,

$$P\{|\{i:|x_i|\le p\}|\le2pK\}=P\left\{\frac1K\sum_{i=1}^K\xi_i\le2p\right\}\ge1-2\exp(-Kp^2/2).$$

□

We now give a proof of Theorem 1.1 from the Introduction.
Proof. Let

$$\varkappa := C\delta^{1/2},$$

where $C$ is the constant from Theorem 2.1. According to Lemma 2.2 with
probability greater than $1-\exp(-C(\delta)K)$ we have

$$|\{i:|x_i|\le\varkappa^{1/3}\}|\le2\varkappa^{1/3}K.$$ (2.24)

To prove the theorem we need to estimate $N(x,\varkappa K)$. Consider $\Lambda\subset T$ such
that

$$\|x_\Lambda\|^2\le\varkappa K.$$

Then we estimate

$$\begin{aligned}
|\Lambda|&=|\{i\in\Lambda:|x_i|\le\varkappa^{1/3}\}|+|\{i\in\Lambda:|x_i|>\varkappa^{1/3}\}|\\
&\le|\{i\in\Lambda:|x_i|\le\varkappa^{1/3}\}|+\frac{\varkappa K}{\varkappa^{2/3}}\overset{(2.24)}{\le}2\varkappa^{1/3}K+\varkappa^{1/3}K=3\varkappa^{1/3}K.
\end{aligned}$$

Therefore, by definition (2.4) we have

$$N(x,\varkappa K)\le3\varkappa^{1/3}K.$$

To complete the proof it remains to apply Theorem 2.1 for $\delta<0.001$
providing

$$N(x,\varkappa K)\le\epsilon K/6.$$

□
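
The counting statement of Lemma 2.2, which enters the proof above through (2.24), can be checked empirically. The following Monte Carlo sketch is only an illustration: the values of `K`, `p`, the number of trials, and the seed are arbitrary choices made here, and a simulation of course proves nothing about the constant $C(p)$.

```python
import random

def count_small(K, p, rng):
    """Number of indices i with |x_i| <= p for K i.i.d. uniform samples on [-1, 1]."""
    return sum(1 for _ in range(K) if abs(rng.uniform(-1.0, 1.0)) <= p)

rng = random.Random(0)          # fixed seed so the run is reproducible
K, p = 2000, 0.1
trials = [count_small(K, p, rng) for _ in range(200)]
worst = max(trials)             # compare with the bound 2 * p * K from Lemma 2.2
mean = sum(trials) / len(trials)
```

The expected count is $pK$, so the bound $2pK$ leaves a margin of many standard deviations, in line with the exponentially small failure probability in the lemma.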

3 Lebesgue-type inequalities 

We discuss here the Lebesgue-type inequalities for the WCGA with $\tau=\{t\}$,
$t\in(0,1]$. We repeat the above assumptions A1 and A2 with remarks on the
corresponding properties of dictionaries. For a given $f_0$ let the sparse element
(signal)

$$f := f^\epsilon := \sum_{i\in T}x_ig_i$$

be such that $\|f_0-f^\epsilon\|\le\epsilon$ and $|T|=K$. For $A\subset T$ denote

$$f_A := f_A^\epsilon := \sum_{i\in A}x_ig_i.$$

Here are two assumptions that we will use.

A1. We say that $f=\sum_{i\in T}x_ig_i$ satisfies the Nikol'skii-type $\ell_1X$
inequality with parameter $r$ if

$$\sum_{i\in A}|x_i|\le C_1|A|^r\|f_A\|,\quad A\subset T,\quad r\ge1/2.$$ (3.1)

We say that a dictionary $\mathcal D$ has the Nikol'skii-type $\ell_1X$ property with
parameters $K$, $r$ if any $K$-sparse element satisfies the Nikol'skii-type $\ell_1X$ inequality
with parameter $r$.

A2. We say that $f=\sum_{i\in T}x_ig_i$ has the incoherence property with
parameters $D$ and $U$ if for any $A\subset T$ and any $\Lambda$ such that $A\cap\Lambda=\emptyset$, $|A|+|\Lambda|\le D$
we have for any $\{c_i\}$

$$\|f_A-\sum_{i\in\Lambda}c_ig_i\|\ge U^{-1}\|f_A\|.$$ (3.2)

We say that a dictionary $\mathcal D$ is $(K,D)$-unconditional with a constant $U$ if for
any $f=\sum_{i\in T}x_ig_i$ with $|T|\le K$ inequality (3.2) holds.

The term unconditional in A2 is justified by the following remark. The
above definition of a $(K,D)$-unconditional dictionary is equivalent to the
following definition. Let $\mathcal D$ be such that any subsystem of $D$ distinct elements
$e_1,\dots,e_D$ from $\mathcal D$ is linearly independent and for any $A$ with $|A|\le K$ and
any coefficients $\{c_i\}$ we have

$$\|\sum_{i\in A}c_ie_i\|\le U\|\sum_{i=1}^Dc_ie_i\|.$$

Let $\mathcal D$ be the Riesz dictionary with depth $D$ and parameter $\delta\in(0,1)$.
This class of dictionaries is a generalization of the class of classical Riesz
bases. We give a definition in a general Hilbert space (see [?], p. 306).

Definition 3.1. A dictionary $\mathcal D$ is called the Riesz dictionary with depth $D$
and parameter $\delta\in(0,1)$ if, for any $D$ distinct elements $e_1,\dots,e_D$ of the
dictionary and any coefficients $a=(a_1,\dots,a_D)$, we have

$$(1-\delta)\|a\|_2^2\le\|\sum_{i=1}^Da_ie_i\|^2\le(1+\delta)\|a\|_2^2.$$

We denote the class of Riesz dictionaries with depth $D$ and parameter
$\delta\in(0,1)$ by $R(D,\delta)$.

It is clear that the term Riesz dictionary with depth $D$ and parameter
$\delta\in(0,1)$ is another name for a dictionary satisfying the Restricted Isometry
Property with parameters $D$ and $\delta$. The following simple lemma holds.

Lemma 3.1. Let $\mathcal D\in R(D,\delta)$ and let $e_j\in\mathcal D$, $j=1,\dots,s$. For
$f=\sum_{i=1}^sa_ie_i$ and $A\subset\{1,\dots,s\}$ denote

$$S_A(f) := \sum_{i\in A}a_ie_i.$$

If $s\le D$ then

$$\|S_A(f)\|^2\le(1+\delta)(1-\delta)^{-1}\|f\|^2.$$

Lemma 3.1 implies that if $\mathcal D\in R(D,\delta)$ then it is $(D,D)$-unconditional
with a constant $U=(1+\delta)^{1/2}(1-\delta)^{-1/2}$.
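
The passage from Lemma 3.1 to the unconditionality constant can be spelled out in one line. The following derivation is a sketch added for the reader and uses only Lemma 3.1 and the form of inequality (3.2); the auxiliary element $h$ is notation introduced here.

```latex
% Let A and \Lambda be disjoint with |A| + |\Lambda| \le D, and put
h := f_A - \sum_{i\in\Lambda} c_i g_i .
% h is a combination of s = |A| + |\Lambda| \le D distinct dictionary
% elements and S_A(h) = f_A, so Lemma 3.1 gives
\|f_A\|^2 = \|S_A(h)\|^2 \le (1+\delta)(1-\delta)^{-1}\,\|h\|^2 ,
% that is,
\Bigl\| f_A - \sum_{i\in\Lambda} c_i g_i \Bigr\| \ge U^{-1}\|f_A\|,
\qquad U = (1+\delta)^{1/2}(1-\delta)^{-1/2}.
```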

We need the concept of cotype of a Banach space $X$. We say that $X$
has cotype $q\ge2$ if for any finite number of elements $u_i\in X$ we have the
inequality

$$\left(\operatorname{Average}_{\pm}\|\sum_i\pm u_i\|^q\right)^{1/q}\ge C_q\left(\sum_i\|u_i\|^q\right)^{1/q}.$$

It is known that the $L_p$ spaces with $2\le p<\infty$ have cotype $q=p$ and $L_p$
spaces with $1\le p\le2$ have cotype 2.

Remark 3.1. Suppose $\mathcal D$ is $(K,K)$-unconditional with a constant $U$.
Assume that $X$ is of cotype $q$ with a constant $C_q$. Then $\mathcal D$ has the Nikol'skii-type
$\ell_1X$ property with parameters $K$, $1-1/q$ and $C_1=2UC_q^{-1}$.

Proof. Our assumption about $(K,K)$-unconditionality implies: for any $A$,
$|A|\le K$, we have

$$\|\sum_{i\in A}\pm x_ig_i\|=\|\sum_{i\in A_+}x_ig_i-\sum_{i\in A_-}x_ig_i\|\le2U\|\sum_{i\in A}x_ig_i\|.$$

Therefore, by the $q$-cotype assumption

$$\|\sum_{i\in A}x_ig_i\|^q\ge(2U)^{-q}C_q^q\sum_{i\in A}|x_i|^q.$$

This implies

$$\sum_{i\in A}|x_i|\le|A|^{1-1/q}\left(\sum_{i\in A}|x_i|^q\right)^{1/q}\le2UC_q^{-1}|A|^{1-1/q}\|\sum_{i\in A}x_ig_i\|.$$

□

The above proof also gives the following individual function version of
Remark 3.1.

Remark 3.2. Suppose $f=\sum_{i\in T}x_ig_i$ has the incoherence property with
parameters $D$ and $U$. Assume that $X$ has cotype $q$ with a constant $C_q$. Then $f$
satisfies the Nikol'skii-type $\ell_1X$ inequality with parameter $r=1-1/q$ and
$C_1=2UC_q^{-1}$.
It is known that a Hilbert space has cotype 2. Therefore, Remark 3.2 
shows that assumption A2 implies assumption Al with r = 1/2. This 
explains how Theorem 1.3 is derived from Theorem 1.2. 

We note that the $(K,CK)$-unconditionality assumption on the dictionary
$\mathcal D$ in a Hilbert space $H$ is somewhat weaker than the assumption
$\mathcal D\in R(CK,\delta)$. Also, our theorems do not assume that the dictionary
satisfies assumptions A1 and A2; we only assume that the individual function
$f$, a $K$-sparse approximation of a given $f_0$, satisfies A1 and A2.

In assumption (3.2) we always have $U\ge1$. In the extreme case $U=1$
assumption (3.2) is a strong assumption that leads to strong results.

Proposition 3.1. Let $X$ be a uniformly smooth Banach space. Assume that
$f=\sum_{i\in T}x_ig_i$, $|T|=K$, and the set of indices $T$ has the following property.
For any $g\in\mathcal D$ distinct from $g_i$, $i\in T$, and any $c_i$, $c$ we have

$$\|\sum_{i\in T}c_ig_i-cg\|\ge\|\sum_{i\in T}c_ig_i\|.$$ (3.3)

Then the WCGA with $t_k\ne0$, $k=1,2,\dots$, recovers $f$ exactly after $K$
iterations.

Proof. It is known (see, for instance, [?], Lemma 6.9, p. 342) that (3.3)
implies

$$F_f(g)=0,\quad g\in\mathcal D\setminus\{g_i\}_{i\in T}.$$

Thus, at the first iteration the WCGA picks $\varphi_1\in\{g_i\}_{i\in T}$. Then $f_1$ has the
form $\sum_{i\in T}c_ig_i$ and we repeat the above argument. Then $\varphi_2\in\{g_i\}_{i\in T}\setminus\{\varphi_1\}$.
After $K$ iterations all $g_i$, $i\in T$, will be taken and therefore we will have
$f_K=0$. □

Proposition 3.1 can be applied in the following situation. Assume that
$\Psi=\{\psi_j\}_{j=1}^\infty$ is a monotone basis for a uniformly smooth Banach space $X$.
Then any $f=\sum_{i=1}^Kc_i\psi_i$ can be recovered by the WCGA after $K$ iterations.
In particular, this applies to the Haar basis in $L_p$, $1<p<\infty$.

We now proceed to main results of this section. 

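Theorem 3.1. Let $X$ be a Banach space with $\rho(u)\le\gamma u^2$. Suppose for a
given $f_0$ we have $\|f_0-f^\epsilon\|\le\epsilon$ with $K$-sparse $f := f^\epsilon$ satisfying A1 and A2.
Then for any $k\ge0$ we have for $K+m\le D$

$$\|f_m\|\le\|f_k\|\exp\left(-\frac{c_1(m-k)}{K^{2r}}\right)+2\epsilon,$$

where $c_1 := \frac{t^2}{32\gamma C_1^2U^2}$.

Proof. Let

$$f := f^\epsilon=\sum_{i\in T}x_ig_i,\quad |T|=K,\quad g_i\in\mathcal D.$$

Denote by $T^m$ the set of indices of $g_i$ picked by the WCGA after $m$ iterations,
$\Gamma^m := T\setminus T^m$. Denote by $A_1(\mathcal D)$ the closure in $X$ of the convex hull of the
symmetrized dictionary $\mathcal D^\pm := \{\pm g,g\in\mathcal D\}$. We will bound $\|f_m\|$ from
above. Assume $\|f_{m-1}\|\ge\epsilon$. Let $m>k$. We bound from below

$$S_m := \sup_{\phi\in A_1(\mathcal D)}|F_{f_{m-1}}(\phi)|.$$

Denote $A_m := \Gamma^{m-1}$. Then

$$S_m\ge F_{f_{m-1}}(f_{A_m}/\|f_{A_m}\|_1),$$

where $\|f_A\|_1 := \sum_{i\in A}|x_i|$. Next, by Lemma 6.9, p. 342, from [?] we obtain

$$F_{f_{m-1}}(f_{A_m})=F_{f_{m-1}}(f^\epsilon)\ge\|f_{m-1}\|-\epsilon.$$

Thus

$$S_m\ge\|f_{A_m}\|_1^{-1}(\|f_{m-1}\|-\epsilon).$$

By (3.1) we get

$$\|f_{A_m}\|_1\le C_1|A_m|^r\|f_{A_m}\|\le C_1K^r\|f_{A_m}\|.$$

Then

$$S_m\ge\frac{\|f_{m-1}\|-\epsilon}{C_1K^r\|f_{A_m}\|}.$$ (3.4)

From the definition of the modulus of smoothness we have for any $\lambda$

$$\|f_{m-1}-\lambda\varphi_m\|+\|f_{m-1}+\lambda\varphi_m\|\le2\|f_{m-1}\|\left(1+\rho\left(\frac{\lambda}{\|f_{m-1}\|}\right)\right)$$ (3.5)

and by (1) from the definition of the WCGA and Lemma 6.10 from [?], p.
343, we get

$$|F_{f_{m-1}}(\varphi_m)|\ge t\sup_{g\in\mathcal D}|F_{f_{m-1}}(g)|=t\sup_{\phi\in A_1(\mathcal D)}|F_{f_{m-1}}(\phi)|=tS_m.$$

Then either $F_{f_{m-1}}(\varphi_m)\ge tS_m$ or $F_{f_{m-1}}(-\varphi_m)\ge tS_m$. Both cases are treated
in the same way. We demonstrate the case $F_{f_{m-1}}(\varphi_m)\ge tS_m$. We have for
$\lambda\ge0$

$$\|f_{m-1}+\lambda\varphi_m\|\ge F_{f_{m-1}}(f_{m-1}+\lambda\varphi_m)\ge\|f_{m-1}\|+\lambda tS_m.$$

From here and from (3.5) we obtain

$$\|f_m\|\le\|f_{m-1}-\lambda\varphi_m\|\le\|f_{m-1}\|+\inf_{\lambda\ge0}\left(-\lambda tS_m+2\|f_{m-1}\|\rho(\lambda/\|f_{m-1}\|)\right).$$

We discuss here the case $\rho(u)\le\gamma u^2$. Using (3.4) we get

$$\|f_m\|\le\|f_{m-1}\|\left(1-\frac{\lambda t}{C_1K^r\|f_{A_m}\|}+2\gamma\frac{\lambda^2}{\|f_{m-1}\|^2}\right)+\frac{\epsilon\lambda t}{C_1K^r\|f_{A_m}\|}.$$

Let $\lambda_1$ be a solution of

$$\frac{\lambda t}{2C_1K^r\|f_{A_m}\|}=2\gamma\frac{\lambda^2}{\|f_{m-1}\|^2},\qquad\lambda_1=\frac{t\|f_{m-1}\|^2}{4\gamma C_1K^r\|f_{A_m}\|}.$$

Our assumption (3.2) gives

$$\|f_{A_m}\|=\|(f^\epsilon-G_{m-1})_{A_m}\|\le U\|f^\epsilon-G_{m-1}\|\le U(\|f_0-G_{m-1}\|+\|f_0-f^\epsilon\|)\le U(\|f_{m-1}\|+\epsilon).$$

Specify

$$\lambda=\frac{t\|f_{A_m}\|}{16\gamma C_1K^rU^2}.$$

Then, using $\|f_{m-1}\|\ge\epsilon$ we get

$$\frac{\lambda}{\lambda_1}=\frac{\|f_{A_m}\|^2}{4U^2\|f_{m-1}\|^2}\le1$$

and obtain

$$\|f_m\|\le\|f_{m-1}\|\left(1-\frac{t^2}{32\gamma C_1^2U^2K^{2r}}\right)+\frac{\epsilon t^2}{16\gamma C_1^2U^2K^{2r}}.$$

Denote $c_1 := \frac{t^2}{32\gamma C_1^2U^2}$. Then

$$\|f_m\|\le\|f_k\|\exp\left(-\frac{c_1(m-k)}{K^{2r}}\right)+2\epsilon.$$

□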
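
The last step of the proof of Theorem 3.1 passes from the one-step recursion $\|f_m\|\le\|f_{m-1}\|(1-q)+2q\epsilon$, $q=c_1K^{-2r}$, to the exponential bound. The sketch below iterates the extremal case of this recursion (equality at every step) and compares it with $\|f_k\|\exp(-q(m-k))+2\epsilon$. All numerical values are arbitrary illustrative assumptions, and the recursion itself is only derived in the proof while $\|f_{m-1}\|\ge\epsilon$.

```python
import math

def iterate_recursion(a_k, q, eps, steps):
    """Worst case of a_m <= a_{m-1} * (1 - q) + 2 * q * eps, taken with equality."""
    values = [a_k]
    for _ in range(steps):
        values.append(values[-1] * (1.0 - q) + 2.0 * q * eps)
    return values

# Illustrative parameters (assumptions, not values from the paper).
t, gamma, C1, U, K, r, eps = 1.0, 0.5, 1.0, 1.5, 10, 0.5, 0.01
c1 = t ** 2 / (32.0 * gamma * C1 ** 2 * U ** 2)
q = c1 / K ** (2 * r)

a_k = 1.0
values = iterate_recursion(a_k, q, eps, steps=2000)
bound_holds = all(
    v <= a_k * math.exp(-q * j) + 2.0 * eps + 1e-12
    for j, v in enumerate(values)
)
```

The reason the bound holds is that $a_m-2\epsilon$ contracts by the factor $1-q\le e^{-q}$ at each step.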



Theorem 3.2. Let $X$ be a Banach space with $\rho(u)\le\gamma u^2$. Suppose $K$-sparse
$f^\epsilon$ satisfies A1, A2 and $\|f_0-f^\epsilon\|\le\epsilon$. Then the WCGA with weakness
parameter $t$ applied to $f_0$ provides

$$\|f_{C(t,\gamma,C_1)U^2\ln(U+1)K^{2r}}\|\le CU\epsilon\quad\text{for}\quad K+C(t,\gamma,C_1)U^2\ln(U+1)K^{2r}\le D$$

with an absolute constant $C$ and $C(t,\gamma,C_1)=C_2\gamma C_1^2t^{-2}$.

We formulate an immediate corollary of Theorem 3.2 with $\epsilon=0$.

Corollary 3.1. Let $X$ be a Banach space with $\rho(u)\le\gamma u^2$. Suppose $K$-sparse
$f$ satisfies A1, A2. Then the WCGA with weakness parameter $t$ applied to $f$
recovers it exactly after $C(t,\gamma,C_1)U^2\ln(U+1)K^{2r}$ iterations under condition
$K+C(t,\gamma,C_1)U^2\ln(U+1)K^{2r}\le D$.
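Proof of Theorem 3.2. We use the above notations $T^m$ and $\Gamma^m := T\setminus T^m$. Let $k\ge0$ be
fixed. Suppose

$$2^{n-1}<|\Gamma^k|\le2^n.$$

For $j=1,2,\dots,n,n+1$ consider the following pairs of sets $A_j$, $B_j$: $A_{n+1}=\Gamma^k$,
$B_{n+1}=\emptyset$; for $j\le n$, $A_j := \Gamma^k\setminus B_j$ with $B_j\subset\Gamma^k$ such that
$|B_j|\ge|\Gamma^k|-2^{j-1}$ and for any set $J\subset\Gamma^k$ with $|J|\ge|\Gamma^k|-2^{j-1}$ we have

$$\|f_{B_j}\|\le\|f_J\|.$$

We note that this implies that if for some $Q\subset\Gamma^k$ we have

$$\|f_Q\|<\|f_{B_j}\|\quad\text{then}\quad|Q|<|\Gamma^k|-2^{j-1}.$$ (3.6)

For a given $b>1$, to be specified later, denote by $L$ the index such that
($B_0 := \Gamma^k$)

$$\|f_{B_0}\|<b\|f_{B_1}\|,$$
$$\|f_{B_1}\|<b\|f_{B_2}\|,$$
$$\dots$$
$$\|f_{B_{L-2}}\|<b\|f_{B_{L-1}}\|,$$
$$\|f_{B_{L-1}}\|\ge b\|f_{B_L}\|.$$

Then

$$\|f_{B_j}\|\le b^{L-1-j}\|f_{B_{L-1}}\|,\quad j=1,2,\dots,L.$$ (3.7)

We now proceed to a general step. Let $m>k$ and let $A,B\subset\Gamma^k$ be such
that $A=\Gamma^k\setminus B$. As above we bound $S_m$ from below. It is clear that $S_m\ge0$.
Denote $A_m := A\cap\Gamma^{m-1}$. Then

$$S_m\ge F_{f_{m-1}}(f_{A_m}/\|f_{A_m}\|_1).$$

Next,

$$F_{f_{m-1}}(f_{A_m})=F_{f_{m-1}}(f_{A_m}+f_B-f_B).$$

Then $f_{A_m}+f_B=f^\epsilon-f_\Lambda$ with $F_{f_{m-1}}(f_\Lambda)=0$. Moreover, it is easy to see
that $F_{f_{m-1}}(f^\epsilon)\ge\|f_{m-1}\|-\epsilon$. Therefore,

$$F_{f_{m-1}}(f_{A_m}+f_B-f_B)\ge\|f_{m-1}\|-\epsilon-\|f_B\|.$$

Thus

$$S_m\ge\|f_{A_m}\|_1^{-1}\max(0,\|f_{m-1}\|-\epsilon-\|f_B\|).$$

By (3.1) we get

$$\|f_{A_m}\|_1\le C_1|A_m|^r\|f_{A_m}\|\le C_1|A|^r\|f_{A_m}\|.$$

Then

$$S_m\ge\frac{\|f_{m-1}\|-\|f_B\|-\epsilon}{C_1|A|^r\|f_{A_m}\|}.$$ (3.8)

From the definition of the modulus of smoothness we have for any $\lambda$

$$\|f_{m-1}-\lambda\varphi_m\|+\|f_{m-1}+\lambda\varphi_m\|\le2\|f_{m-1}\|\left(1+\rho\left(\frac{\lambda}{\|f_{m-1}\|}\right)\right)$$

and by (1) from the definition of the WCGA and Lemma 6.10 from [?], p.
343, we get

$$|F_{f_{m-1}}(\varphi_m)|\ge t\sup_{g\in\mathcal D}|F_{f_{m-1}}(g)|=t\sup_{\phi\in A_1(\mathcal D)}|F_{f_{m-1}}(\phi)|.$$

From here we obtain

$$\|f_m\|\le\|f_{m-1}\|+\inf_{\lambda\ge0}\left(-\lambda tS_m+2\|f_{m-1}\|\rho(\lambda/\|f_{m-1}\|)\right).$$

We discuss here the case $\rho(u)\le\gamma u^2$. Using (3.8) we get

$$\|f_m\|\le\|f_{m-1}\|\left(1-\frac{\lambda t}{C_1|A|^r\|f_{A_m}\|}+2\gamma\frac{\lambda^2}{\|f_{m-1}\|^2}\right)+\frac{\lambda t(\|f_B\|+\epsilon)}{C_1|A|^r\|f_{A_m}\|}.$$

Let $\lambda_1$ be a solution of

$$\frac{\lambda t}{2C_1|A|^r\|f_{A_m}\|}=2\gamma\frac{\lambda^2}{\|f_{m-1}\|^2},\qquad\lambda_1=\frac{t\|f_{m-1}\|^2}{4\gamma C_1|A|^r\|f_{A_m}\|}.$$

Our assumption (3.2) gives

$$\|f_{A_m}\|\le U(\|f_{m-1}\|+\epsilon).$$

Specify

$$\lambda=\frac{t\|f_{A_m}\|}{16\gamma C_1|A|^rU^2}.$$

Then $\lambda\le\lambda_1$ and we obtain

$$\|f_m\|\le\|f_{m-1}\|\left(1-\frac{t^2}{32\gamma C_1^2U^2|A|^{2r}}\right)+\frac{t^2(\|f_B\|+\epsilon)}{16\gamma C_1^2|A|^{2r}U^2}.$$ (3.9)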



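Denote $c_1 := \frac{t^2}{32\gamma C_1^2U^2}$ and $c_2 := \frac{t^2}{16\gamma C_1^2U^2}$. This implies for $m_2>m_1\ge k$

$$\|f_{m_2}\|\le\|f_{m_1}\|(1-c_1/|A|^{2r})^{m_2-m_1}+\frac{c_2(m_2-m_1)}{|A|^{2r}}(\|f_B\|+\epsilon).$$

Define $m_0 := k$ and, inductively,

$$m_j=m_{j-1}+\beta|A_j|^{2r},\quad j=1,\dots,n.$$

At iterations from $m_{j-1}+1$ to $m_j$ we use $A=A_j$ and obtain from (3.9) that

$$\|f_{m_j}\|\le\|f_{m_{j-1}}\|e^{-c_1\beta}+2(\|f_{B_j}\|+\epsilon).$$

With $\eta := e^{-c_1\beta}$, iterating this inequality up to $j=L$ gives

$$\|f_{m_L}\|\le\|f_k\|\eta^L+2\sum_{j=1}^L(\|f_{B_j}\|+\epsilon)\eta^{L-j}.$$

We bound the $\|f_k\|$. It follows from the definition of $f_k$ that $\|f_k\|$ is the error
of best approximation of $f_0$ by the subspace $\Phi_k$. Representing $f_0=f+f_0-f$
we see that $\|f_k\|$ is not greater than the error of best approximation of $f$ by
the subspace $\Phi_k$ plus $\|f_0-f\|$. This implies $\|f_k\|\le\|f_{B_0}\|+\epsilon$. Therefore
using (3.7) we continue

$$\|f_{m_L}\|\le(\|f_{B_0}\|+\epsilon)\eta^L+2\sum_{j=1}^L\left(\|f_{B_{L-1}}\|(\eta b)^{L-j}b^{-1}+\epsilon\eta^{L-j}\right).$$

We will specify $\beta$ later. However, we note that it will be chosen in such a
way that guarantees $\eta<1/2$. Choose $b=\frac{1}{2\eta}$. Then

$$\|f_{m_L}\|\le\|f_{B_{L-1}}\|8e^{-c_1\beta}+4\epsilon.$$ (3.10)

By (3.2) we get

$$\|f_{\Gamma^{m_L}}\|\le U(\|f_{m_L}\|+\epsilon)\le U(\|f_{B_{L-1}}\|8e^{-c_1\beta}+5\epsilon).$$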

If $\|f_{B_{L-1}}\|\le10U\epsilon$ then by (3.10)

$$\|f_{m_L}\|\le CU\epsilon.$$

If $\|f_{B_{L-1}}\|\ge10U\epsilon$ then making $\beta$ sufficiently large to satisfy $16Ue^{-c_1\beta}<1$,
so that $\beta=\frac{C_3\ln(U+1)}{c_1}$, we get

$$U(\|f_{B_{L-1}}\|8e^{-c_1\beta}+5\epsilon)<\|f_{B_{L-1}}\|$$

and therefore

$$\|f_{\Gamma^{m_L}}\|<\|f_{B_{L-1}}\|.$$

This implies (see (3.6))

$$|\Gamma^{m_L}|<|\Gamma^k|-2^{L-2}.$$

We begin with $f_0$ and apply the above argument (with $k=0$). As a result we
either get the required inequality or we reduce the cardinality of support of $f$
from $|T|=K$ to $|\Gamma^{m_{L_1}}|<|T|-2^{L_1-2}$, $m_{L_1}\le\beta2^{2rL_1}$. We continue the process
and build a sequence $m_{L_j}$ such that $m_{L_j}\le\beta2^{2rL_j}$ and after $m_{L_j}$ iterations
we reduce the support by at least $2^{L_j-2}$. We also note that $m_{L_j}\le\beta2^{2r}K^{2r}$.
We continue this process till the following inequality is satisfied for the first
time

$$m_{L_1}+\dots+m_{L_n}\ge2^{4r}K^{2r}\beta.$$ (3.11)

Then, clearly,

$$m_{L_1}+\dots+m_{L_n}\le2^{4r+1}\beta K^{2r}.$$

Using the inequality

$$(a_1+\dots+a_n)^\theta\le a_1^\theta+\dots+a_n^\theta,\quad a_j\ge0,\quad\theta\in(0,1],$$

we derive from (3.11)

$$\begin{aligned}
2^{L_1-2}+\dots+2^{L_n-2}&\ge\left(2^{2r(L_1-2)}+\dots+2^{2r(L_n-2)}\right)^{\frac{1}{2r}}\\
&\ge2^{-2}\left(2^{2rL_1}+\dots+2^{2rL_n}\right)^{\frac{1}{2r}}\\
&\ge2^{-2}\left(\beta^{-1}(m_{L_1}+\dots+m_{L_n})\right)^{\frac{1}{2r}}\ge K.
\end{aligned}$$

Thus, after not more than $N := 2^{4r+1}\beta K^{2r}$ iterations we recover $f$ exactly
and then $\|f_N\|\le\|f_0-f\|\le\epsilon$.

□

Theorem 1.2 from the Introduction follows from Theorems 3.2 and 3.1.

4 Discussion



We begin by presenting some known results about exact recovery and the
Lebesgue-type inequalities for incoherent dictionaries. In this case we use
another natural generalization of the WOMP. This generalization of the WOMP
was introduced in [?]. In the paper [?] we proved Lebesgue-type inequalities
for that algorithm. We now formulate the corresponding results. We recall a
generalization of the concept of $M$-coherent dictionary to the case of Banach
spaces (see, for instance, [?]).

Let $\mathcal D$ be a dictionary in a Banach space $X$. The coherence parameter of
this dictionary is defined as

$$M(\mathcal D) := \sup_{g \neq h;\ g,h \in \mathcal D}\ \sup_{F_g} |F_g(h)|.$$

In general, a norming functional $F_g$ is not unique. This is why we take
$\sup_{F_g}$ over all norming functionals of $g$ in the definition of $M(\mathcal D)$. We do
not need $\sup_{F_g}$ in the definition of $M(\mathcal D)$ if for each $g \in \mathcal D$ there is a unique
norming functional $F_g \in X^*$. Then we define $\mathcal D^* := \{F_g, g \in \mathcal D\}$ and call
$\mathcal D^*$ a dual dictionary to a dictionary $\mathcal D$. It is known that the uniqueness of
the norming functional $F_g$ is equivalent to the property that $g$ is a point of
Gateaux smoothness:

$$\lim_{u \to 0} (\|g + uy\| + \|g - uy\| - 2\|g\|)/u = 0$$

for any $y \in X$. In particular, if $X$ is uniformly smooth then $F_f$ is unique for
any $f \neq 0$. We considered in [?] the following greedy algorithm which
generalizes the Weak Orthogonal Greedy Algorithm to a Banach space setting.

Weak Quasi-Orthogonal Greedy Algorithm (WQOGA). Let $t \in (0,1]$.
Denote $f_0 := f_0^{q,t} := f$ and find $\varphi_1 := \varphi_1^{q,t} \in \mathcal D$ such that

$$|F_{\varphi_1}(f_0)| \ge t \sup_{g \in \mathcal D} |F_g(f_0)|.$$

Next, we find $c_1$ satisfying

$$F_{\varphi_1}(f - c_1\varphi_1) = 0.$$

Denote $f_1 := f_1^{q,t} := f - c_1\varphi_1$.



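We continue this construction in an inductive way. Assume that we
have already constructed residuals $f_0, f_1, \dots, f_{m-1}$ and dictionary elements
$\varphi_1, \dots, \varphi_{m-1}$. Now, we pick an element $\varphi_m := \varphi_m^{q,t} \in \mathcal D$ such that

$$|F_{\varphi_m}(f_{m-1})| \ge t \sup_{g \in \mathcal D} |F_g(f_{m-1})|.$$

Next, we look for $c_1^m, \dots, c_m^m$ satisfying

$$F_{\varphi_j}\Big(f - \sum_{i=1}^m c_i^m \varphi_i\Big) = 0, \quad j = 1, \dots, m. \tag{4.1}$$

If there is no solution to (4.1) then we stop, otherwise we denote
$G_m := G_m^{q,t} := \sum_{i=1}^m c_i^m \varphi_i$ and $f_m := f_m^{q,t} := f - G_m$ with
$c_1^m, \dots, c_m^m$ satisfying (4.1).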

Remark 4.1. Note that (4.1) has a unique solution if
$\det(F_{\varphi_j}(\varphi_i))_{i,j=1}^m \neq 0$. Applying the WQOGA in the case of a
dictionary with the coherence parameter $M := M(\mathcal D)$ gives, by a simple well
known argument on the linear independence of the rows of the matrix
$(F_{\varphi_j}(\varphi_i))_{i,j=1}^m$, the conclusion that (4.1) has a unique solution for any
$m < 1 + 1/M$. Thus, in the case of an $M$-coherent dictionary $\mathcal D$, we can run
the WQOGA for at least $[1/M]$ iterations.
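The construction above can be sketched in code in the simplest special case. The sketch below assumes a finite dictionary of unit-norm vectors in $\mathbb R^n$ with the Euclidean norm, where the norming functional of $g$ is $F_g(h) = \langle h, g\rangle$, so that system (4.1) becomes the Gram system $\sum_i c_i\langle\varphi_i,\varphi_j\rangle = \langle f,\varphi_j\rangle$ and the WQOGA reduces to the WOMP. The names `dot`, `solve` and `wqoga_hilbert`, the tolerance values, and the rule "take the first admissible index" in the weak greedy step are illustrative assumptions, not taken from the paper.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting.

    Returns None when A is numerically singular, i.e. when the
    system has no unique solution.
    """
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[piv][col]) < 1e-12:
            return None
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def wqoga_hilbert(f, dictionary, t=1.0, max_iter=None, tol=1e-10):
    """Hilbert-space specialization of the WQOGA, 0 < t <= 1.

    Returns (chosen indices, coefficients, residual).
    """
    if max_iter is None:
        max_iter = len(dictionary)
    residual = list(f)
    chosen, coeffs = [], []
    for _ in range(max_iter):
        # |F_g(f_{m-1})| = |<f_{m-1}, g>| for every element not yet chosen
        scores = {i: abs(dot(residual, g))
                  for i, g in enumerate(dictionary) if i not in chosen}
        if not scores:
            break
        best = max(scores.values())
        if best <= tol:
            break  # residual is numerically orthogonal to the dictionary
        # Weak greedy step: any index with score >= t * sup is admissible;
        # this sketch takes the first such index (an arbitrary choice).
        idx = next(i for i, s in scores.items() if s >= t * best)
        trial = chosen + [idx]
        # System (4.1) in the Hilbert case: Gram matrix times c = <f, phi_j>
        gram = [[dot(dictionary[i], dictionary[j]) for i in trial]
                for j in trial]
        rhs = [dot(f, dictionary[j]) for j in trial]
        c = solve(gram, rhs)
        if c is None:
            break  # (4.1) is not uniquely solvable: stop
        chosen, coeffs = trial, c
        residual = [fk - sum(ci * dictionary[i][k]
                             for ci, i in zip(coeffs, chosen))
                    for k, fk in enumerate(f)]
    return chosen, coeffs, residual
```

For example, with the dictionary consisting of the 16 coordinate vectors of $\mathbb R^{16}$ and the constant vector with entries $1/4$, calling `wqoga_hilbert(f, D)` on a combination of two dictionary elements returns the selected indices, the coefficients solving the Gram system, and the final residual.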

In the case $t = 1$ we call the WQOGA the Quasi-Orthogonal Greedy
Algorithm (QOGA). In the case of QOGA we need to make an extra
assumption that the corresponding maximizer $\varphi_m \in \mathcal D$ exists. Clearly, it is the
case when $\mathcal D$ is finite.

It was proved in [?] (see also [?], p. 382) that the WQOGA is as good as 
the WOMP in the sense of exact recovery of sparse signals with respect to 
incoherent dictionaries. The following result was obtained in [?] . 

Theorem 4.1. Let $t \in (0,1]$. Assume that $\mathcal D$ has coherence parameter $M$.
Let $K < \frac{t}{1+t}(1 + 1/M)$. Then for any $f_0$ of the form

$$f_0 = \sum_{i=1}^K a_i g_i,$$

where $g_i$ are distinct elements of $\mathcal D$, the WQOGA recovers it exactly after $K$
iterations. In other words, $f_K = 0$.
It is known (see [?], pp. 303-305) that the bound $K < \frac{1}{2}(1 + 1/M)$ is
sharp for exact recovery by the OGA.
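To make the bound concrete, the following minimal sketch computes the coherence parameter and the largest sparsity covered by the bound $K < \frac{1}{2}(1 + 1/M)$. It assumes a finite dictionary of unit-norm vectors in $\mathbb R^n$ with the Euclidean norm, where $F_g(h) = \langle h, g\rangle$; the function names `coherence` and `max_guaranteed_sparsity` are hypothetical and chosen only for this illustration.

```python
import math
from itertools import combinations


def coherence(dictionary):
    """M = max |<g, h>| over distinct pairs of unit-norm vectors."""
    return max(abs(sum(a * b for a, b in zip(g, h)))
               for g, h in combinations(dictionary, 2))


def max_guaranteed_sparsity(M):
    """Largest integer K with K < (1 + 1/M) / 2 (strict inequality).

    For M == 0 (an orthonormal system) the bound is vacuous and
    math.inf is returned.
    """
    if M == 0:
        return math.inf
    bound = 0.5 * (1.0 + 1.0 / M)
    return math.ceil(bound) - 1
```

For the 16 coordinate vectors of $\mathbb R^{16}$ together with the constant vector with entries $1/4$, the coherence is $1/4$ and the bound reads $K < 2.5$, so 2-sparse elements are covered.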

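We introduce a new norm, associated with a dictionary $\mathcal D$, by the formula

$$\|f\|_{\mathcal D} := \sup_{g \in \mathcal D} |F_g(f)|, \quad f \in X.$$

We define best $m$-term approximation in the norm $Y$ as follows

$$\sigma_m(f)_Y := \inf_{g \in \Sigma_m(\mathcal D)} \|f - g\|_Y.$$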

In [?] the norm $Y$ was either the norm $X$ of our Banach space or the norm
$\|\cdot\|_{\mathcal D}$ defined above. The following two Lebesgue-type inequalities were
proved in [?].

Theorem 4.2. Assume that $\mathcal D$ is an $M$-coherent dictionary. Then for
$m \le 1/(3M)$ we have for the QOGA

$$\|f_m\|_{\mathcal D} \le 13.5\,\sigma_m(f)_{\mathcal D}. \tag{4.2}$$

Theorem 4.3. Assume that $\mathcal D$ is an $M$-coherent dictionary in a Banach
space $X$. There exists an absolute constant $C$ such that, for $m \le 1/(3M)$,
we have for the QOGA

$$\|f_m\|_X \le C \inf_{g \in \Sigma_m(\mathcal D)} (\|f - g\|_X + m\|f - g\|_{\mathcal D}).$$

Corollary 4.1. Using the inequality $\|g\|_{\mathcal D} \le \|g\|_X$, we obtain from Theorem 4.3

$$\|f_m\|_X \le C(1 + m)\sigma_m(f)_X.$$
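The step from Theorem 4.3 to Corollary 4.1 can be written out as follows. It uses only that every norming functional has norm one, so that $\|h\|_{\mathcal D} \le \|h\|_X$ for all $h \in X$; the symbol $\Sigma_m(\mathcal D)$ for the set of $m$-term combinations of dictionary elements is notation assumed here.

```latex
\begin{align*}
\|f_m\|_X
  &\le C \inf_{g \in \Sigma_m(\mathcal D)}
        \bigl( \|f - g\|_X + m \|f - g\|_{\mathcal D} \bigr) \\
  &\le C \inf_{g \in \Sigma_m(\mathcal D)} (1 + m) \|f - g\|_X \\
  &= C (1 + m)\, \sigma_m(f)_X .
\end{align*}
```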

Inequality (4.2) is a perfect (up to a constant 13.5) Lebesgue-type
inequality. It indicates that the norm $\|\cdot\|_{\mathcal D}$ used in [?] is a suitable norm for
analyzing performance of the QOGA. Corollary 4.1 shows that the Lebesgue-type
inequality (4.2) in the norm $\|\cdot\|_{\mathcal D}$ implies the Lebesgue-type inequality
in the norm $\|\cdot\|_X$.

Thus, results of this paper complement the above discussed results from
[?] and [?]. Results from [?] and [?] deal with incoherent dictionaries and use
the QOGA for exact recovery and the Lebesgue-type inequalities. Results of
this paper deal with dictionaries which satisfy assumptions A1 and A2 and
we analyze the WCGA here. In the case of a Hilbert space, assumptions A1
and A2 are satisfied if $\mathcal D$ has RIP. It is well known that the RIP condition is
much weaker than the incoherence condition in the case of a Hilbert space.
It is interesting to note that we do not know how the coherence parameter
$M(\mathcal D)$ is related to properties A1 and A2 in the case of a Banach space.

We now give a few applications of Theorem 1.2 for specific dictionaries
$\mathcal D$. We begin with the case when $\mathcal D$ is a basis $\Psi$ for $X$. In some of our
examples we take $X = L_p$, $2 \le p < \infty$. Then it is known that $\rho(u) \le \gamma u^2$
with $\gamma = (p-1)/2$.

Example 1. Let $X$ be a Banach space with $\rho(u) \le \gamma u^2$ and with cotype
$q$. Let $\Psi$ be a normalized in $X$ unconditional basis for $X$. Then $U \le C(X,\Psi)$.
By Remark 3.1 $\Psi$ satisfies A1 with $r = 1 - \frac{1}{q}$. Theorem 1.2 gives

$$\|f_{C(t,X,\Psi)K^{2-2/q}}\| \le C\sigma_K(f_0,\Psi). \tag{4.3}$$

We note that (4.3) provides some progress in Open Problem 7.1 (p. 91) from
[?].

Example 2. Let $\Psi$ be a uniformly bounded orthogonal system normalized
in $L_p(\Omega)$, $2 \le p < \infty$, where $\Omega$ is a bounded domain. Then we can take
$r = 1/2$. The inequality

$$\|g\|_p \le CK^{1/2-1/p}\|g\|_2$$

for $K$-sparse $g$ implies that $U \le CK^{1/2-1/p}$. Theorem 1.2 gives

$$\|f_{C(t,p,D)K^{2/p'}\ln K}\|_p \le C\sigma_K(f_0,\Psi)_p. \tag{4.4}$$
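The exponent of $K$ in (4.4) can be traced by a short computation. This is a sketch under the assumption (our reading of Theorem 1.2, which is stated in the Introduction) that the number of iterations there is of order $U^2\ln(U+1)K^{2r}$; here $p'$ denotes the conjugate exponent, $1/p + 1/p' = 1$, and $C'$ is a constant depending on $C$ and $p$.

```latex
\begin{align*}
U^2 K^{2r}
  &\le C^2 K^{2(1/2 - 1/p)} \cdot K^{2 \cdot 1/2}
   = C^2 K^{1 - 2/p} \cdot K
   = C^2 K^{2 - 2/p}
   = C^2 K^{2/p'}, \\
\ln(U + 1)
  &\le \ln\bigl( C K^{1/2 - 1/p} + 1 \bigr)
   \le C' \ln K \qquad (K \ge 2).
\end{align*}
```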

Inequality (4.4) provides some progress in Open Problem 7.2 (p. 91) from
[?].

Theorem 1.2 can also be applied to quasi-greedy bases and other greedy-type
bases (see [?]). We plan to discuss these applications in detail in our
future work.

In this paper we limit ourselves to the case of Banach spaces satisfying
the condition $\rho(u) \le \gamma u^2$. In particular, as we mentioned above, the $L_p$ spaces
with $2 \le p < \infty$ satisfy this condition. Clearly, the $L_p$ spaces with $1 < p \le 2$
are also of interest. For the clarity of presentation we do not discuss the case
$\rho(u) \le \gamma u^q$ in this paper. The technique from Section 3 works in this case
too and we will present the corresponding results in our future work.